(57) ABSTRACT

A distributed (e.g., client/server) computing environment is 
described which implements protocol methodology improv- 
ing the streaming of objects, such as for distributed appli- 
cations. In particular, the methodology facilitates streaming 
of objects (e.g., Java objects) stored and managed remotely 
(e.g., objects stored and managed in relational databases) to 
clients in a highly efficient manner. The methodology may 
be implemented by extending an existing streaming meth- 
odology or protocol to include a class identifier approach for 
supporting object serialization. A Class ID (ACI) serialization
is provided as a protocol for converting between a Java
object and a binary representation. ACI is intended for an
environment in which all classes ever involved in any 
serialization are known by the environment (as is often the 
case). Each class known to the environment is represented 
by a compact numeric identifier, and it is this identifier alone 
that is used to represent the class description in the serial- 
ization. A table of the class identifiers is kept at the begin- 
ning of each serialization. A simple transformation is applied 
to achieve portability, so that any ACI serialization can be 
converted to a portable serialization, a Class Descriptor 
serialization (ACD). The ACD is identical to ACI except that 
the class identifier table beginning ACI is replaced by a table 
of class descriptors. These class descriptors contain virtually 
the same information as standard (e.g., Sun) class 
descriptors, so an ACD serialization has the same portability 
characteristics as Sun serialization. In this manner, the 
present invention provides the ability to create and stream 
objects, particularly Java objects, in a manner which does 
not incur a substantial size or resource penalty. 

18 Claims, 6 Drawing Sheets 
SYSTEM AND METHOD FOR IMPROVED
SERIALIZATION OF JAVA OBJECTS 

RELATED APPLICATIONS 

The present application claims the benefit of priority from 
and is related to the following commonly-owned U.S. appli- 
cation: application Ser. No. 60/127,653, entitled System and 
Method for Improved Serialization of Java Objects, filed
Apr. 2, 1999. The present application is also related to the 
following commonly-owned U.S. application: application 
Ser. No. 09/233365, entitled System and Method for Seri- 
alizing Java Objects in a Tabular Data Stream, filed Jan. 19, 
1999 and now U.S. Pat. No. 6,356,946. The disclosures of 
the foregoing applications are hereby incorporated by ref- 
erence in their entirety, including any appendices or attach- 
ments thereof, for all purposes. 

COPYRIGHT NOTICE 

A portion of the disclosure of this patent document 
contains material which is subject to copyright protection. 
The copyright owner has no objection to the facsimile 
reproduction by anyone of the patent document or the patent 
disclosure as it appears in the Patent and Trademark Office 
patent file or records, but otherwise reserves all copyright 
rights whatsoever. 

BACKGROUND OF THE INVENTION 

The present invention relates generally to data access and 
processing in a distributed computing system and, more 
particularly, to a system implementing methodology for 
improving data streaming of objects in distributed computer 
environments. 

Computers are very powerful tools for storing and pro- 
viding access to vast amounts of information. Computer 
databases are a common mechanism for storing information 
on computer systems while providing easy access to users. 
A typical database is an organized collection of related 
information stored as "records" having "fields" of informa- 
tion. As an example, a database of employees may have a 
record for each employee where each record contains fields 
designating specifics about the employee, such as name, 
home address, salary, and the like. 

Between the actual physical database itself (i.e., the data 
actually stored on a storage device) and the users of the 
system, a database management system or DBMS is typi- 
cally provided as a software cushion or layer. In essence, the 
DBMS shields the database user from knowing or even 
caring about underlying hardware-level details. Typically,
all requests from users for access to the data are processed 
by the DBMS. For example, information may be added or 
removed from data files, information retrieved from or 
updated in such files, and so forth, all without user knowl- 
edge of underlying system implementation. In this manner, 
the DBMS provides users with a conceptual view of the 
database that is removed from the hardware level. The 
general construction and operation of a database management
system is known in the art. See, e.g., Date, C., An
Introduction to Database Systems, Volume I and II,
Addison-Wesley, 1990; the disclosure of which is hereby incorporated
by reference. 

DBMS systems have long since moved from a centralized 
mainframe environment to a de-centralized or distributed 
environment. One or more PC "client" systems, for instance,
may be connected via a network to one or more server-based 
database systems (SQL database server). Well-known 
examples of computer networks include local-area networks
(LANs) where the computers are geographically close
together (e.g., in the same building), and wide-area networks
(WANs) where the computers are farther apart and are
connected by telephone lines or radio waves.

Often, networks are configured as "client/server" networks, such that each computer on the network is either a "client" or a "server." Servers are powerful computers or processes dedicated to managing shared resources, such as storage (i.e., disk drives), printers, modems, or the like. Servers are often dedicated, meaning that they perform no other tasks besides their server tasks. For instance, a database server is a computer system that manages database information, including processing database queries from various clients. The client part of this client-server architecture typically comprises PCs or workstations which rely on a server to perform some operations. Typically, a client runs a "client application" that relies on a server to perform some operations, such as returning particular database information. Often, client-server architecture is thought of as a "two-tier architecture," one in which the user interface runs on the client or "front end" and the database is stored on the server or "back end." The actual business rules or application logic driving operation of the application can run on either the client or the server (or even be partitioned between the two). In a typical deployment of such a system, a client application, such as one created by an information service (IS) shop, resides on all of the client or end-user machines. Such client applications interact with host database engines (e.g., Sybase® Adaptive Server™), executing business logic which traditionally ran at the client machines.

More recently, the development model has shifted from standard client/server or two-tier development to a three-tier (or n-tier), component-based development model. This newer client/server architecture introduces three well-defined and separate processes, each typically running on a different platform. A "first tier" provides the user interface, which runs on the user's computer (i.e., the client). Next, a "second tier" provides the functional modules that actually process data. This middle tier typically runs on a server, often called an "application server." A "third tier" furnishes a database management system (DBMS) that stores the data required by the middle tier. This tier may run on a second server called the database server.

The three-tier design has many advantages over traditional two-tier or single-tier designs. For example, the added modularity makes it easier to modify or replace one tier without affecting the other tiers. Separating the application functions from the database functions makes it easier to implement load balancing. Thus, by partitioning applications cleanly into presentation, application logic, and data sections, the result will be enhanced scalability, reusability, security, and manageability.

In a typical client/server environment, the client knows about the database directly and can submit a database query for retrieving a result set which is generally returned as a tabular data set. In a three-tier environment, particularly a component-based one, the client never communicates directly with the database. Instead, the client typically communicates through one or more components. Components themselves are defined using one or more interfaces, where each interface is a collection of methods. In general, components return information via output parameters. In the conventional, standard client/server development model, in contrast, information is often returned from databases in the form of tabular result sets, via a database interface such as Open Database Connectivity (i.e., ODBC, available from
Microsoft Corp. of Redmond, Washington) or Java Database Connectivity (i.e., JDBC, available from Sun Microsystems of Mountain View, Calif.). A typical three-tier environment would, for example, include a middle tier comprising business objects implementing business rules (logic) for a particular organization. The business objects, not the client, communicate with the database.

For their part, application writers or developers like to write object-oriented programs using modern object-oriented programming techniques. At the same time, however, these developers prefer to have their data (i.e., the data employed by the application) stored in a database having relational tables, as that is an easy way of storing and retrieving data. A particular problem arises when one wants to retrieve data from the database for use (e.g., manipulation) within one's program: how is this "flat" data converted into objects? In this regard, "object" refers to the specific programming construct that defines associated data members and methods (typically, including data hiding and containment), such as an object instantiated from a C++ class, a Java class, an Object Pascal class, or the like.

One of the advantages of Java as an object-oriented language over C++ is in Java's ability to flatten objects into a standard binary representation. This ability to flatten objects allows the persisting of objects in files or databases, or transmission of objects between applications across a network. Because the representation is standard, applications written by different vendors can exchange objects without having to revert to a proprietary protocol. This standard representation was developed by Sun Microsystems and will be referred to herein as Sun serialization.

Sun serialization is a protocol for converting between a Java object and its binary representation. The binary representation is an array of bytes coded to represent the Java object using the Sun serialization protocol. How the Java object is represented within its particular host virtual runtime environment (virtual machine or VM) is irrelevant to its binary encoding. A Java object itself is a collection of data fields whose values are interpreted by a Java class. The Java class of an object may specify one or more named typed fields, whose values are contained in the object. Java classes can be "subclassed," meaning other classes can inherit the named typed fields of a particular class, and provide additional named typed fields.

When an object is serialized using Sun serialization, a description of the object's class is serialized along with it. The class description is the template that allows the object to be reconstructed. Such a template allows a meaningful interpretation of the object's data, without which the data would just be a stream of bytes. The class description includes details of the class field names and types. With this information, another goal is achieved: the description acts as versioning information. Classes can be modified over time as the development process dictates, and an object serialized under an earlier version of a particular class must be able to be "deserialized" as a newer version of the class. This would, in general, be impossible without the serialization's inclusion of class field names and types. When deserialization takes place the old class description can be compared to the newer description and fields can be mapped as needed.

The inclusion of the detailed class description in the object serialization makes those serializations portable and "versionable." Unfortunately, however, this is done today at the expense of sometimes considerable size required to represent the descriptions. A time penalty also results, from the time taken to write the description. Accordingly, a better solution is sought.

What is desired is a solution providing the ability to create and stream objects, particularly Java objects, in a manner which does not incur a considerable size or resource penalty. Moreover, such a solution should preserve portability. The present invention fulfills this and other needs.

SUMMARY OF THE INVENTION

A distributed (e.g., client/server) computing environment is described which, in accordance with the present invention, simplifies the use of objects in distributed applications or other instances where transfer of objects is required. In particular, the invention provides an improved methodology for streaming objects (e.g., Java objects) stored and managed remotely (e.g., objects stored and managed in relational databases) to clients in a highly efficient manner. Once at the clients, the objects may be executed or otherwise manipulated locally as ...

The present invention may be implemented by extending an existing streaming methodology or protocol, such as Sybase Tabular Data Stream (TDS) or other comparable streaming protocol. Streaming is modified to include a class identifier approach of the present invention for supporting object serialization. A Class ID (referred to hereafter as ACI) serialization is provided as a protocol for converting between a Java object and a binary representation. Like Sun serialization, it operates to provide object serialization. Unlike Sun serialization, however, the class description required in ACI is dramatically less, thereby minimizing the time penalty and storage requirements usually required to represent class description information in a stream.

ACI is intended for an environment in which all classes ever involved in any serialization are known by the environment (as is often the case). Each class known to the environment is represented by a compact numeric identifier, and it is this identifier alone that is used to represent the class description in the serialization. A table of the class identifiers is kept at the beginning of each serialization. ACI is much smaller but, without further enhancement, the approach would be at the expense of portability. In accordance with the present invention, however, a simple transformation is applied so that any ACI serialization can be converted to a portable serialization.

Class Descriptor serialization (ACD) is also provided for achieving portability. The ACD is identical to ACI except that the class identifier table beginning ACI is replaced by a table of class descriptors. These class descriptors contain virtually the same information as Sun class descriptors, so an ACD serialization has the same portability characteristics as Sun serialization. Converting between ACI and ACD serializations is a very simple and computationally frugal process. Because both are otherwise identical (apart from the class identifier tables), only the class table contents need change. The environment maintains a correspondence between the ACI class identifiers and ACD class descriptors. In this manner, the present invention provides the ability to create and stream objects, particularly Java objects, in a manner which does not incur a substantial size or resource penalty.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a block diagram of a computer system in which the present invention may be embodied.

FIG. 1B is a block diagram of a software system for controlling operation of the computer system of FIG. 1A.

FIG. 2 is a block diagram of a distributed computer environment (which includes the computer system of FIG. 1A) in which the present invention is preferably embodied.
FIGS. 3A-E present a flowchart illustrating an object serialization method of the present invention.

DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT

The following description will focus on the presently-preferred embodiment of the present invention, which is operative in a distributed computing environment executing application programs which interact with remote data, such as that which is stored on an SQL database server. The present invention, however, is not limited to any particular application or environment. Instead, those skilled in the art will find that the present invention may be advantageously applied to any application or environment where optimization of object data access and processing is desirable, including non-SQL database management systems and the like. The following description is, therefore, for the purpose of illustration and not limitation.

Standalone System Hardware

The invention may be embodied on a computer system such as the system 100 of FIG. 1A, which comprises a central processor 101, a main memory 102, an input/output controller 103, a keyboard 104, a pointing device 105 (e.g., mouse, track ball, pen device, or the like), a screen display device 106, and a mass storage 107 (e.g., hard or fixed disk, removable disk, optical disk, magneto-optical disk, or flash memory). Processor 101 includes or is coupled to a cache memory 109 for storing frequently accessed information; memory 109 may be an on-chip cache or external cache (as shown). Additional output device(s) 108, such as a printing device, may be included in the system 100 as desired. As shown, the various components of the system 100 communicate through a system bus 110 or similar architecture. In a preferred embodiment, the system 100 includes an IBM-compatible personal computer system, available from a variety of vendors (including IBM of Armonk, New York).

Standalone System Software

Illustrated in FIG. 1B, a computer software system 150 is provided for directing the operation of the computer system 100. Software system 150, which is stored in system memory 102 and on mass storage or disk memory 107, includes a kernel or operating system (OS) 140 and a windows-based GUI (graphical user interface) shell 145. One or more application programs, such as application software programs 155, may be "loaded" (i.e., transferred from storage 107 into memory 102) for execution by the system 100. The system also includes a user interface 160 for receiving user commands and data as input and displaying result data as output.

Also shown, the software system 150 includes a Relational Database Management System (RDBMS) front-end or "client" 170. The RDBMS client 170 may be any one of a number of database front-ends, including PowerBuilder™, Sybase PowerJ™, PowerC++™, Borland Paradox®, Microsoft® Access, or the like, and the front-end may include SQL access drivers (e.g., Sybase JConnect™ JDBC driver or the like) for accessing database tables from an SQL database server operating in a Client/Server environment.

Client/Server Database Management System

While methods of the present invention may operate within a single (standalone) computer (e.g., system 100 of FIG. 1A), the present invention is preferably embodied in a distributed computer environment, such as illustrated in FIG. 2. Distributed computing environment 200 includes JDBC-enabled client application(s) 210 (e.g., clients 211, 213, 215), connected to back-end database server(s) 230 (e.g., servers 231, 233, 235) via a Database Driver architecture 220. Database Driver architecture 220 itself includes a Java SQL driver interface 221 and manager 223 (e.g., the Java SQL driver manager provided by Sun Microsystems), for managing one or more JDBC drivers, such as JDBC driver 225 (e.g., Sybase JConnect™ JDBC driver provided by Sybase, Inc.). In an exemplary embodiment, the Clients may themselves include thin-client applications (e.g., Java programs) running on standalone workstations, dumb terminals, or personal computers (PCs), such as the above-described system 100. Typically, such units would operate under a client operating system, such as Microsoft Windows 9x for PC clients. Each Back End Server, such as Sybase® SQL Server™, now Sybase® Adaptive Server™ (available from Sybase, Inc. of Emeryville, Calif.) in an exemplary embodiment, generally operates as an independent process (i.e., independently of the Clients), running under a server operating system such as Microsoft Windows NT (Microsoft Corp. of Redmond, Wash.), NetWare (Novell of Provo, Utah), UNIX (Novell), or OS/2 (IBM). Here, the components of the system communicate over a network which may be any one of a number of conventional network systems, including a Local Area Network (LAN) or Wide Area Network (WAN), as is known in the art (e.g., using Ethernet, IBM Token Ring, or the like). The network includes functionality for packaging client calls in the well-known SQL (Structured Query Language) together with any parameter information into a format (of one or more packets) suitable for transmission across a cable or wire, for delivery to the database servers.

Client/server environments, database servers, and networks are well documented in the technical, trade, and patent literature. For a discussion of database servers and client/server environments generally, and SQL Server™ particularly, see, e.g., Nath, A., The Guide to SQL Server, Second Edition, Addison-Wesley Publishing Company, 1995. Additional documentation of SQL Server™ is available from Sybase, Inc. as SQL Server Documentation Set (Catalog No. 49600). The disclosures of each of the foregoing are hereby incorporated by reference.

Improved serialization of Java objects

A. Introduction

It is known in the art to employ a communication protocol for effecting communication between database components, such as between a client front end and a database server back end. Typically, such communication protocols include native support for traditional SQL (e.g., ANSI SQL-92) data types, such as character (char), variable-length character (vchar), binary (blob), date-time, time stamp, together with some support for vendor-specific data types.

One example is Sybase® "Tabular Data Stream" which provides a communication protocol for effecting communication between Sybase-branded database products. Tabular Data Stream (TDS) is an application-level protocol used to send requests and responses between clients and servers. A client's request may contain multiple commands. The response from the server may return one or many result sets. TDS relies on a connection-oriented transport service. Session, presentation, and application service elements are provided by TDS. Since TDS does not require any specific transport provider, it can be implemented over multiple transport protocols if they provide connection-oriented service. TDS provides support for login capability negotiation, authentication services, and support for both database specific and generic client commands. Responses to client commands are returned using a self-describing, table-oriented protocol. Column name and data type information is returned to the client before the actual data is returned.
The TDS protocol is mostly a token-based protocol where
the contents of a Protocol Data Unit (PDU) are tokenized. 
The token and its data stream describe a particular command 
or part of a result set returned to a client. For example, there 
is a token called TDS_LANGUAGE which is used by a 
client to send language, typically SQL, commands to a 
server. There is also a token called TDS_ROWFMT which 
describes the column name, status, and data type which is 
used by a server to return column format information to a 
client. The TDS protocol is half-duplex. A client writes a 
complete request and then reads a complete response from 
the server. Requests and responses cannot be intermixed and 
multiple requests cannot be outstanding. 

A TDS request or response may span multiple PDUs. The 
size of the PDU sent over the transport connection is 
negotiated at dialog establishment time. Each PDU contains 
a header, which is usually followed by data. A PDU header 
contains information about the size and contents of the PDU 
as well as an indication if it is the last PDU in a request or 
response. 
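
The PDU framing described above can be illustrated with a minimal Java sketch. The header layout used here (a 2-byte PDU length followed by a 1-byte "last PDU" flag) is a hypothetical simplification chosen for illustration and is not the actual TDS wire format; the names PduSketch, split, and reassemble are likewise assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class PduSketch {
    // Hypothetical header: bytes 0-1 hold the PDU length, byte 2 is 1 for the last PDU.
    static final int HEADER_SIZE = 3;

    // Split one request or response into PDUs no larger than the negotiated size.
    static List<byte[]> split(byte[] message, int pduSize) {
        if (pduSize <= HEADER_SIZE || pduSize > Short.MAX_VALUE) {
            throw new IllegalArgumentException("unsupported PDU size: " + pduSize);
        }
        int maxPayload = pduSize - HEADER_SIZE;
        List<byte[]> pdus = new ArrayList<>();
        int offset = 0;
        do {
            int len = Math.min(maxPayload, message.length - offset);
            boolean last = offset + len == message.length;
            ByteBuffer pdu = ByteBuffer.allocate(HEADER_SIZE + len);
            pdu.putShort((short) (HEADER_SIZE + len)); // size of this PDU
            pdu.put((byte) (last ? 1 : 0));            // end-of-message indicator
            pdu.put(message, offset, len);             // payload
            pdus.add(pdu.array());
            offset += len;
        } while (offset < message.length);
        return pdus;
    }

    // Concatenate payloads until the PDU flagged as last has been consumed.
    static byte[] reassemble(List<byte[]> pdus) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] pdu : pdus) {
            out.write(pdu, HEADER_SIZE, pdu.length - HEADER_SIZE);
            if (pdu[2] == 1) {
                break;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] request = "select name from sysobjects where id<3"
                .getBytes(StandardCharsets.UTF_8);
        List<byte[]> pdus = split(request, 16);
        System.out.println("PDUs sent: " + pdus.size());
        System.out.println(new String(reassemble(pdus), StandardCharsets.UTF_8));
    }
}
```

In this sketch the receiver needs only the header of each PDU to know how much data follows and whether more PDUs belong to the same request or response.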

As an illustration of this protocol, consider, for example, 
the SQL statement: "select name from sysobjects where
id<3". The following will illustrate a high-level description 
of the TDS tokens exchanged by a client and a server to 
establish a dialog and then execute a simple SQL query. The 
query causes two table rows to be returned to the client. The 
client first requests a transport connection to the server and 
then sends a login record to establish a dialog. The login 
record contains capability and authentication information. 



Client                                Server

login packet                  -->
                              <--     TDS_LOGINACK
                              <--     TDS_DONE

Now that a dialog has been established between the client and the server, the client sends the SQL query to the server and then waits for the server to respond.

Client                                Server

LANGUAGE: "select name. . ."  -->
The server executes the query and returns the results to the
client. First the data columns are described by the server, 
followed by the actual row data. A completion token follows 
the row data indicating that all row data associated with the 
query has been returned to the client. 
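
As a rough illustration of the response ordering just described (column format first, then row data, then a completion token), the following Java sketch models the server's reply as a list of labeled tokens. TDS_ROWFMT and TDS_DONE are named in the text; the TDS_ROW label, the class and method names, and the placeholder row values are assumptions made for this example only, and no real wire encoding is attempted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TdsDialogSketch {
    // Build the token sequence a server would return for a single-column result set.
    static List<String> serverResponse(String columnName, List<String> rows) {
        List<String> tokens = new ArrayList<>();
        tokens.add("TDS_ROWFMT: " + columnName); // describe the column before any data
        for (String row : rows) {
            tokens.add("TDS_ROW: " + row);       // one token per returned row (assumed name)
        }
        tokens.add("TDS_DONE");                  // completion token ends the response
        return tokens;
    }

    public static void main(String[] args) {
        // Two placeholder rows, matching the two-row result of the example query.
        List<String> response = serverResponse("name", Arrays.asList("row1", "row2"));
        for (String token : response) {
            System.out.println("<-- " + token);
        }
    }
}
```

Because the protocol is half-duplex, the client in this model would read the whole list, through the completion token, before writing its next request.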



B. Support for Storing Java Objects as Column Data in a 
Table 

1. General 

SQL databases may be modified to directly store Java 
objects as column data in a database. In this manner, the user 
is able to create queries (e.g., in SQL) that have predicates that
refer to individual objects, including their individual fields 
and methods, in an extended form of SQL. In a database, 
Java classes are treated as data types, and a column can be 
declared with a Java class as its data type. The corresponding 
JDBC access driver supports storing Java objects in a 
database by implementing setObject( ) methods and 
getObject( ) methods. This makes it possible to use the 
JDBC driver with an application that uses native JDBC 
classes and methods to directly store and retrieve Java 
objects as column data. The following describes the require- 
ments and procedures for storing objects in a table and 
retrieving them using the JDBC driver in the system of the 
present invention. 

2. Prerequisites for Storing Java Objects As Column Data 
In order to store Java objects belonging to a user-defined 

Java class in a column, the following requirements should be 
met. First, the class should implement the java.io.Serializ- 
able interface. This is because the JDBC driver in a preferred 
embodiment employs the native Java serialization and dese- 
rialization to send objects to a database and receive them 
back from the database. Second, the class definition should 
be installed in the destination database. Finally, the client 
system should have the class definition in a class file that is 
accessible through the local CLASSPATH environment vari- 
able. 
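
The first prerequisite can be illustrated with a minimal sketch of native Java serialization, which is the mechanism the driver is described as relying on. The Point class and the helper names below are hypothetical and are not part of any driver; the sketch only shows that a class implementing java.io.Serializable can be flattened to bytes and rebuilt, and that the resulting byte array is larger than the raw field data because a class description travels with the object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationSketch {
    // Hypothetical user-defined class; two int fields hold 8 bytes of data.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x;
        int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Flatten an object into its standard binary representation.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Rebuild an object; the class definition must be on the local class path.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new Point(3, 4));
        Point copy = (Point) deserialize(data);
        System.out.println("serialized size in bytes: " + data.length);
        System.out.println("round trip: " + copy.x + ", " + copy.y);
    }
}
```

The gap between the serialized size and the 8 bytes of field data is the class-description overhead that a compact class identifier is intended to reduce.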

3. Sending a Java Object to a Database 

To send an instance of a user-defined class as column data, 
one employs one of the following setObject( ) methods, as 
specified in the PreparedStatement interface: 
void setObject(int parameterIndex, Object x, int targetSqlType, int scale) throws SQLException;
void setObject(int parameterIndex, Object x, int targetSqlType) throws SQLException;
void setObject(int parameterIndex, Object x) throws SQLException;
The following example defines an Address class, shows the 
definition of a "Friends" table that has an Address column 
whose data type is the Address class, and inserts a row into 
the table. 



Although the above-described communication protocol is 
employed in the preferred embodiment, the present inven- 
tion may be implemented using any comparable data stream- 
ing protocol. Regardless of the communication protocol 
employed, the protocol is extended in accordance with the 
present invention to support object-based data types, such as 
Java objects, thus allowing these objects to become full class 
SQL objects. 






import java.io.Serializable;

public class Address implements Serializable
{
    public String streetNumber;
    public String street;
    public String apartmentNumber;
    public String city;
    public int zipCode;
    // Methods
}

Friends table:
(firstname varchar(30),
 lastname varchar(30),
 address Address,
 phone varchar(15))

// Connect to the database containing the Friends table.
Connection conn =
    DriverManager.getConnection("jdbc:sybase:Tds:localhost:5000",
        "username", "password");
// Create a PreparedStatement object with an insert statement
// for updating the Friends table.
PreparedStatement ps = conn.prepareStatement("INSERT INTO Friends
    values (?,?,?,?)");
// Now, set the values in the prepared statement object, ps.
// Set firstname to "Joan."
ps.setString(1, "Joan");
// Set lastname to "Smith."
ps.setString(2, "Smith");
// Assuming that we already have "Joan_address" as an instance
// of Address, use setObject(int parameterIndex, Object x, int
// targetSqlType) to set the address column to "Joan_address."
// Note that the targetSqlType is java.sql.Types.JAVA_OBJECT, with a
// designated integer value of "2000."
ps.setObject(3, Joan_address, 2000);
// Set the phone column to Joan's phone number.
ps.setString(4, "123-456-7890");
// Perform the insert.
ps.executeUpdate();






4. Receiving a Java Object from the Database
A client JDBC application can receive a Java object from 
the database in a result set or as the value of an output 
parameter returned from a stored procedure. If a result set 
contains a Java object as column data, one may employ the 
following getObject( ) methods in the ResultSet interface to
assign the object to a class variable.
Object getObject(int columnIndex) throws SQLException;
Object getObject(String columnName) throws SQLException;

If an output parameter from a stored procedure contains a 
Java object, one may employ the following getObject( ) 
method in the CallableStatement interface to assign the 
object to a class variable. 

Object getObject(int parameterIndex) throws SQLException;

The following example illustrates the use of 
ResultSet ,getObject(int columnlndex) to assign an object 
received in a result set to a class variable. The example uses 
the Address class and Friends table of the previous section 
and presents a simple application that prints a name and 40 
address on an envelope. 



/*
** This application takes a first and last name, gets the
** specified person's address from the Friends table in the
** database, and addresses an envelope using the name and
** retrieved address.
*/
import java.sql.*;

public class Envelope
{
    Connection conn = null;
    String firstName = null;
    String lastName = null;
    String street = null;
    String city = null;
    String zip = null;

    // The envelope dimensions are accepted but not used in this example.
    public Envelope(int height, int width)
    {
    }

    public static void main(String[] args)
    {
        if (args.length < 2)
        {
            System.out.println("Usage: Envelope <firstName> <lastName>");
            System.exit(1);
        }

        // create a 4" x 10" envelope
        Envelope e = new Envelope(4, 10);
        try
        {
            // connect to the database with the Friends table
            e.conn = DriverManager.getConnection(
                "jdbc:sybase:Tds:localhost:5000", "username",
                "password");
            // look up the address of the specified person
            e.firstName = args[0];
            e.lastName = args[1];
            PreparedStatement ps = e.conn.prepareStatement(
                "SELECT address FROM friends WHERE " +
                "firstname = ? AND lastname = ?");
            ps.setString(1, e.firstName);
            ps.setString(2, e.lastName);
            ResultSet rs = ps.executeQuery();
            if (rs.next())
            {
                Address a = (Address) rs.getObject(1);
                // set the destination address on the envelope
                e.setAddress(e.firstName, e.lastName, a);
            }
            e.conn.close();
        }
        catch (SQLException sqe)
        {
            sqe.printStackTrace();
            System.exit(2);
        }

        // if everything was successful, print the envelope
        e.print();
    }

    private void setAddress(String fname, String lname, Address a)
    {
        street = a.streetNumber + " " + a.street + " " +
            a.apartmentNumber;
        city = a.city;
        zip = "" + a.zipCode;
    }

    private void print()
    {
        // Print the name and address on the envelope.
    }
}



C. Class Descriptor-based Serialization 

1. Class ID Serialization 

As described above, the inclusion of the detailed class
description in the object serialization makes those serializations
portable and versionable, at the expense of the sometimes
considerable size required to represent the descriptions.
A time penalty also results from the time taken to
write the description.

In accordance with the present invention, a class identifier
approach is introduced for supporting object serialization. A
Class ID (referred to hereafter as ACI) serialization is provided
as a protocol for converting between a Java object and a
binary representation. Like Sun serialization, it operates to
provide object serialization. Unlike Sun serialization,
however, ACI requires dramatically less class description.

ACI is intended for an environment in which all classes 
ever involved in any serialization are known by the envi- 
ronment (as is often the case). Each class known to the 
environment is represented by a compact numeric identifier, 
and it is this identifier alone that is used to represent the class 
description in the serialization. A table of the class identifiers 
is kept at the beginning of each serialization. An ACI
serialization is much smaller but, without further enhancement,
the reduction in size would come at the expense of portability.
In accordance with the present invention, however, a simple
transformation is applied so that any ACI serialization can be
converted to a portable serialization.

2. Class Descriptor Serialization 

Class Descriptor serialization (ACD) is identical to ACI
except that the class identifier table beginning ACI is
replaced by a table of class descriptors. These class descriptors
contain virtually the same information as Sun class
descriptors, so an ACD serialization has the same portability
characteristics as Sun serialization. Converting between ACI
and ACD serializations is a simple and computationally
inexpensive process. Because the two are otherwise identical (apart
from the class tables), only the class table contents
need change. The environment maintains a correspondence
between the ACI class identifiers and ACD class descriptors.
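
As an illustration of why the conversion is inexpensive, the following is a minimal sketch of an ACI-to-ACD transformation. It assumes the header bytes given in the serialization grammar (0x20 for ACI, 0xC0 for ACD), class identifiers stored as compact numbers, and a hypothetical lookup from class identifier to an already-encoded class descriptor. The class name AciToAcd, the method convert, and the nullClassdesc parameter are illustrative names, not part of the described system.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;

// Hypothetical helper: rewrites only the class table of a serialization.
final class AciToAcd {
    static final int ACI_HEADER = 0x20; // serial-type-classid-header
    static final int ACD_HEADER = 0xC0; // serial-type-classdesc-header

    // descriptors: class id -> encoded class descriptor (assumed lookup
    // maintained by the environment).
    // nullClassdesc: encoded terminator of the descriptor table (passed in
    // because its exact encoding is treated as an assumption here).
    static byte[] convert(byte[] aci, Map<Long, byte[]> descriptors,
                          byte[] nullClassdesc) {
        if (aci.length == 0 || (aci[0] & 0xFF) != ACI_HEADER) {
            throw new IllegalArgumentException("not an ACI serialization");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(ACD_HEADER);
        int pos = 1;
        while (true) {
            // Read one class id: seven bits per byte, low-order bits first,
            // high bit set when another byte follows.
            long id = 0;
            int shift = 0;
            int b;
            do {
                if (pos >= aci.length) {
                    throw new IllegalArgumentException("truncated class table");
                }
                b = aci[pos++] & 0xFF;
                id |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            if (id == 0) {
                break; // null-classid ends the class identifier table
            }
            byte[] desc = descriptors.get(id);
            if (desc == null) {
                throw new IllegalArgumentException("unknown class id " + id);
            }
            out.write(desc, 0, desc.length);
        }
        out.write(nullClassdesc, 0, nullClassdesc.length);
        // Everything after the class table is copied unchanged.
        out.write(aci, pos, aci.length - pos);
        return out.toByteArray();
    }
}
```

Only the table at the front is decoded; the remainder of the serialization is copied byte for byte, which is what keeps the transformation cheap.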
3. Grammar of Serialization 

Rules are provided for object serial representation as
shown below, using (typical) grammar notation.
Rule 1: [X] indicates that X is optional
Rule 2: [X ...] indicates 0 or more occurrences of X
Rule 3: X [...] indicates 1 or more occurrences of X
Rule 4: X | Y indicates X or Y must occur
Rule 5: 0x is used to precede a hexadecimal literal value
Rule 6: (datatype) indicates the Java type of the following token
Rule 7: 'C' indicates an ASCII character literal
Rule 8: (utf8) indicates a UTF8 string encoding

Using the above notation, the following object serialization
fields are defined.



object-serialization: object-serial-type classdesc-table object-table
serial-type: serial-type-classid-header | serial-type-classdesc-header
serial-type-classid-header: (byte) 0x20
serial-type-classdesc-header: (byte) 0xC0
classdesc-table: classid [...] null-classid
    | classdesc [...] null-classdesc
classid: compact-int
null-classid: (byte) 0x0
object-table: object [...] null-object
null-object: null-classid
object: class-object | simple-object | array-object
class-object: proxy-classid proxy-classid
simple-object: object-piece [...]
object-piece: proxy-classid object-piece-data
object-piece-data: field-data [...]
field-data: primitive-field-data
    | object-proxyid
primitive-field-data: boolean-field-data
    | char-field-data
    | byte-field-data
    | short-field-data
    | int-field-data
    | float-field-data
    | long-field-data
    | double-field-data
boolean-field-data: (boolean)
char-field-data: (char)
byte-field-data: (byte)
short-field-data: (short)
int-field-data: (int)
float-field-data: (float)
long-field-data: (long)
double-field-data: (double)
array-object: primitive-array-object
    | object-array-object
primitive-array-object: primitive-type array-size primitive-array-data
primitive-type: (byte) 0x5 // boolean
    | (byte) 0x6 // char
    | (byte) 0x7 // float
    | (byte) 0x8 // double
    | (byte) 0x9 // byte
    | (byte) 0xA // short
    | (byte) 0xB // int
    | (byte) 0xC // long
array-size: compact-int
primitive-array-data: boolean-array-data
    | char-array-data
    | byte-array-data
    | short-array-data
    | int-array-data
    | float-array-data
    | long-array-data
    | double-array-data
boolean-array-data: boolean-field-data [...]
char-array-data: char-field-data [...]
byte-array-data: byte-field-data [...]
short-array-data: short-field-data [...]
int-array-data: int-field-data [...]
float-array-data: float-field-data [...]
long-array-data: long-field-data [...]
double-array-data: double-field-data [...]
object-array-object: object-type object-array-class_signature array-size
    object-array-data
object-type: (byte) 0x1
object-array-class_signature: '[' [...] { primitive-signature
    | 'L' proxy-classid } '[0]'
primitive-signature: 'Z' // boolean
    | 'C' // char
    | 'F' // float
    | 'D' // double
    | 'B' // byte
    | 'S' // short
    | 'I' // int
    | 'J' // long
object-array-data: object-proxyid [...]
classdesc: classdesc-serial-type class-name class-flags
    total-class-members data-member [...]
class-name: (utf8)
member-name: (utf8)
data-member: member-name { primitive-data-member | object-data-member }
primitive-data-member: primitive-type
object-data-member: object-type object-class-name
classdesc-serial-type: 0x80
null-classdesc: classdesc-serial-type '0'
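
To make the primitive-array-object rule concrete, the following is a minimal sketch that emits an int array as a primitive-type byte (0xB), a compact array-size, and then the elements. The class and method names are hypothetical. Writing each (int) with DataOutputStream.writeInt (four bytes, high byte first) is an assumption, since the grammar states only the Java type, and writing a size of zero as a single 0x00 byte is likewise an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical writer for the primitive-array-object rule (int arrays only).
final class PrimitiveArrayWriter {
    static final int TYPE_INT = 0xB; // primitive-type tag for int

    // Seven bits per byte, low-order bits first; the high bit marks that
    // another byte follows. Assumption: zero is written as one 0x00 byte.
    static void writeCompactInt(DataOutputStream out, int n) throws IOException {
        do {
            int low = n & 0x7F;
            n >>>= 7;
            out.writeByte(n != 0 ? (low | 0x80) : low);
        } while (n != 0);
    }

    // primitive-array-object: primitive-type array-size primitive-array-data
    static byte[] writeIntArray(int[] values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(TYPE_INT);
            writeCompactInt(out, values.length);
            for (int v : values) {
                out.writeInt(v); // assumption: four bytes, high byte first
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }
}
```

Under these assumptions a two-element int array occupies ten bytes: one type byte, one size byte, and eight bytes of element data.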



The compact-int rule is used to indicate a format for 
storing numbers efficiently. As will be explained, proxy- 
classid and object-classid will, in general, correspond to 
relatively small numbers; however, there is no limit to how 
much they can grow. Using a fixed size (e.g., four bytes) for 
storing these identifiers would entail wasted space for the 
normally small identifiers. Using a smaller size (e.g., two 
bytes), on the other hand, would impose constraints on the 
maximum size of a serialized object. Instead, all identifiers 
are stored as compact numbers. In a compact-int, each byte 
in the quantity uses seven bits to represent the number, with 
the other bit set when there exists a following byte. A method 
is defined in the following pseudo-code. 



while( N > 0 ) {
    if( (N & ~0x7F) != 0 ) {
        WriteByte( (N & 0x7F) | 0x80 )
    } else {
        WriteByte( N & 0x7F )
    }
    N = N >> 7;
}
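
The pseudo-code above can be sketched in Java as follows, together with the matching decoder. The class and method names are hypothetical. One deliberate difference is labeled in the code: the pseudo-code loop writes nothing when N is 0, whereas this sketch assumes that zero is written as a single 0x00 byte so that every value occupies at least one byte.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for the compact-int format; names are illustrative.
final class CompactInt {
    // Encodes a non-negative value, low-order seven bits first. The high
    // bit of each byte is set when another byte follows.
    static byte[] encode(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("negative value");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Assumption: zero is emitted as one 0x00 byte (the do-while runs
        // once), unlike the pseudo-code loop, which emits nothing for 0.
        do {
            int low = (int) (n & 0x7F);
            n >>>= 7;
            out.write(n != 0 ? (low | 0x80) : low);
        } while (n != 0);
        return out.toByteArray();
    }

    // Decodes the compact-int that starts at the given offset.
    static long decode(byte[] buf, int offset) {
        long value = 0;
        int shift = 0;
        for (int i = offset; i < buf.length; i++) {
            value |= (long) (buf[i] & 0x7F) << shift;
            if ((buf[i] & 0x80) == 0) {
                return value; // high bit clear: this was the last byte
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated compact-int");
    }
}
```

For example, the value 300 is encoded as the two bytes 0xAC 0x02, while any value below 128 occupies a single byte.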



Now, to understand a method for improved object
serialization, assume the following notation. Let O be the
object being serialized. Let R(O) be the set of all objects
reachable from O. Let C(O) be the set containing the class
and superclasses of O. If O is in fact a class object, then C(O)
contains O as well. Let C(R(O))={C(o) for each o in R(O)}.
Let clid(C) be the class id of class C. Let proxy(C) be the
proxy id of class C. The proxy id of a class is its ordinal
position within the class table plus 0x10. The addition of
0x10 is to allow distinguishing from the primitive-array-types.
Let proxy(O) be the proxy id of object O. The proxy
id of an object is its ordinal position within the object table.
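
Part of this notation can be sketched in Java. The following computes C(O) by walking the superclass chain and derives a class proxy id from a class table. The names are hypothetical, and two assumptions are made: ordinal positions are counted from 1 (leaving 0 free for null), and the chain includes java.lang.Object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating C(O) and proxy(C); names are illustrative.
final class ProxyIds {
    static final int CLASS_PROXY_OFFSET = 0x10;

    // C(O): the class of o followed by its superclasses, most derived first.
    // If o is itself a class object, o is included as well.
    static List<Class<?>> classChain(Object o) {
        List<Class<?>> chain = new ArrayList<>();
        if (o instanceof Class<?>) {
            chain.add((Class<?>) o);
        }
        for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
            chain.add(c);
        }
        return chain;
    }

    // proxy(C): ordinal position in the class table plus 0x10.
    // Assumption: ordinal positions start at 1.
    static int classProxy(List<Class<?>> classTable, Class<?> c) {
        int index = classTable.indexOf(c);
        if (index < 0) {
            throw new IllegalArgumentException("class not in table: " + c);
        }
        return (index + 1) + CLASS_PROXY_OFFSET;
    }
}
```

With these assumptions the first class in the table receives proxy id 0x11, which cannot collide with the primitive-type values 0x5 through 0xC.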
An improved method of object serialization may be
summarized by the following method steps, as illustrated in
FIGS. 3A-E. At the outset, at step 301, a byte representing
the type of serialization is written, serial-type-classid-header.
Next, at step 302, the classid-table is written using
the following logic: for each class C in C(R(O)), the classid
of C, clid(C), is written. For each C, a proxy(C) is generated,
whose value is the position in the classid table, starting at
position 1. The classid table is terminated by a null classid,
0.

Now, for each object o in R(O), the object may be
streamed out as follows. As shown at step 303, the method
switches (i.e., branches) based on object type. If o is a
primitive array, the method branches to step 311, to apply the
following substeps (substeps 311a-c, shown in FIG. 3B).

(a) Write primitive array type;

(b) Write the size of the array; and

(c) For each primitive element of o, write the element
beginning with the 0th element.

If, on the other hand, o is an object array, the method
branches to step 312, to apply the following substeps
(substeps 312a-d, shown in FIG. 3C).

(a) Write object array type;

(b) Write the signature of the array;

(c) Write the size of the array; and

(d) For each object element p of o, write proxy(p), or 0 if
p is null, beginning with the 0th element.

If o is a class object, the method branches to step 313, to
apply the following substeps (substeps 313a-b, shown in
FIG. 3D).

(a) Write proxy(class(o)); and

(b) Write proxy(o).

Otherwise, the method branches to step 314, to apply the
following substeps (substeps 314a-b, shown in FIG. 3E).

(a) Write the proxy(class(o));

(b) For each class or superclass class(i,o) of o, starting
from the most derived class:
    For each serializable field f of class(i,o):
        If f is a primitive typed field, write the corresponding
        value in o;
        Otherwise, f must be an object typed field. Therefore,
        let p be the corresponding object value in o. If p
        is null, write 0, else write proxy(p).

The method concludes by terminating the object-table with
the null proxyid, 0, as shown by step 304 (FIG. 3A).

D. Practical use and test results

The SQL employed in a DBMS may be extended to allow
the installation of Java classes into a database. For instance,
the database engine Sybase Adaptive SQL Anywhere (ASA)
includes a Java VM, thus allowing Java to be invoked from
SQL and run in the context of the database engine. In
addition, database table columns can be created with type
corresponding to Java types, allowing the storage of Java
objects in the database. Database data is generally saved in
persistent stores, so ASA may store its Java objects in the
database using a serialization of the object.

A compact serialization for storing Java objects was
preferred. Database data is generally kept as compact as
possible within reason. Clearly, the amount of compaction
must be weighed against the time required to do the
compacting. Compact data leads to less space required to store
the data, and less I/O time required to read and write the
data. The absence of class descriptor information makes ACI
a much more compact serialization than Sun. Within the
database environment, all classes ever installed into the
database are known, so the class identifiers needed by ACI
can be and are maintained within the database.

Consider, for instance, the following simple class.

public class MyClass implements java.io.Serializable {
    int field1;
    int field2;
    public MyClass( int f1, int f2 ) {
        ...
    }
}

An instance of MyClass using Sun serialization
requires 54 bytes. The same instance serialized via ACI is 16
bytes, and by ACD is 36 bytes. These results are summarized
by the following Table 1.

Table 1
Sun serialization    54 bytes
ACI serialization    16 bytes
ACD serialization    36 bytes

Although ACI is very efficient to use for storing Java
objects in an environment like a database, there
are cases where an object must leave the confines of that
environment, and hence there is a necessity for a portable
format. The best example of the need for such a portable
format is database replication. In replication it is necessary
to transfer data from one database to another. Since class
identifiers are unique only to a particular database, class
identifiers in one database will likely not correspond to the
same classes in another database. The ACD serialization
provides the necessary portability.

The problem of replication has some other interesting
characteristics. Replication is concerned with syncing database
data across multiple databases, so replication most
often just replicates database changes. These changes are
most efficiently drawn from the database log file. In fact,
replication does not even need to communicate with the
database engine. Replication needs only to understand the
log file.

In the system of the present invention, the system's log
file stores Java object serializations in ACI format, but
replication requires them in ACD format. This is where the
class tables beginning ACI and ACD are particularly
advantageous. Class descriptors are also stored in the log file, so
as a replication process scans a log file, it builds up a list of
known class descriptors with their corresponding class
identifiers, and replaces the ACI class table with an ACD class
table in every Java object serialization. Hence, without even
requiring a running Java VM, Java objects can be easily
transformed from one format to the other.

While the invention is described in some detail with
specific reference to a single preferred embodiment and
certain alternatives, there is no intent to limit the invention
to that particular embodiment or those specific alternatives.
Thus, the true scope of the present invention is not limited
to any one of the foregoing exemplary embodiments but is
instead defined by the appended claims.

What is claimed is:

1. In a system comprising a computer network having a
database server and a client, an improved method for allowing
a client to retrieve an object stored in a database table
residing on a database server, the method comprising:

providing a streaming protocol for transferring objects 
from the database server to the client; 

receiving from the client a request for serialization of a 
particular object for transferring the particular object 
from the database server to the client, wherein said 
particular object is a Java object comprising at least one 
class, and wherein said particular object is stored in a 
relational database table at the database server; 

in response to the request, creating a class identifier for 
uniquely identifying each class from which the particu- 
lar object is derived that is already known to the 
system, thereby supporting conversion of the particular 
object to and from a binary representation without 
transmitting class descriptor information; 

creating a serialization comprising a binary representation 
of the particular object suitable for streaming 
transmission, said serialization including a table of said 
class identifiers for the particular object; 

streaming the binary representation of the particular 
object from the database server to the client; and 

upon receipt of the streamed binary representation at the 
client, recreating at the client a copy of said particular 
object. 

2. The method of claim 1, further comprising converting 
said serialization into a portable serialization by: 

creating a class descriptor for each class from which the 
particular object is derived, for providing detailed class 
description information in the object serialization for 
making the serialization portable; 

for each class identifier of a given class, specifying a 
correspondence between the class identifier of the 
given class and a class descriptor for that class, wherein 
said class descriptor comprises information for con- 
verting the particular object to and from a binary 
representation when the given class is unknown to the 
system; and 

transforming said serialization into a portable serializa- 
tion by replacing said table of class identifiers with a 
suitable table of class descriptors. 

3. The method of claim 2, wherein said objects comprise 
Java objects and wherein said portable serialization com- 
prises Java-compatible serialization. 
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4. The method of claim 2, wherein said table of class 
identifiers requires substantially less storage than said table 
of class descriptors. 

5. The method of claim 2, wherein the correspondence 
between a class identifier of a given class and a corresponding
class descriptor for that class is maintained by the
system. 

6. The method of claim 1, wherein each class identifier 
comprises a numeric identifier. 

7. The method of claim 1, wherein each class identifier
comprises a compact numeric identifier comprising a quan- 
tity of at least one byte value. 

8. The method of claim 7, wherein said compact numeric 
identifier comprises a variable-length numeric identifier

wherein each byte of the identifier uses seven bits to 
represent a number quantity and one bit to indicate whether 
an additional byte follows for the identifier. 

9. The method of claim 1, wherein said objects comprise 
Java objects derived from Java classes.

10. The method of claim 1, further comprising: 
storing said serialization in a database table at the data- 
base server. 

11. The method of claim 1, wherein said request comprises
an SQL query received from the client.

12. The method of claim 1, wherein said Java object 
includes instantiated Java class data members and class 
methods. 

13. The method of claim 1, wherein said client comprises 
a database application executing at a client machine.

14. The method of claim 1, wherein said protocol com- 
prises a token-based protocol. 

15. The method of claim 1, wherein said particular object 
comprises a Java object stored as column data in a database 

table of the database server.

16. The method of claim 1, wherein said serialization 
includes at its beginning said table of said class identifiers 
for the particular object. 

17. The method of claim 1, wherein said system maintains 
a table of classes known to the system.

18. The method of claim 17, wherein a class identifier for 
a given class is created, at least in part, by basing the class 
identifier on an ordinal position of the given class in said 
table of classes. 

* * * * * 
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